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Abstract 

Suppose we wish to recover a vector xq € R m (e.g. a digital signal or image) from 
incomplete and contaminated observations y = Ax o + e; A is a n by m matrix with far 
fewer rows than columns (n -C m) and e is an error term. Is it possible to recover Xq 
accurately based on the data yl 

To recover cco, we consider the solution x$ to the U-regularization problem 
min H^lbj subject to || Ax - y\\^ < e, 

where e is the size of the error term e. We show that if A obeys a uniform uncertainty 
principle (with unit-normed columns) and if the vector xq is sufficiently sparse, then 
the solution is within the noise level 

lk # - zolk < C ■ e. 

As a first example, suppose that A is a Gaussian random matrix, then stable recovery 
occurs for almost all such A’s provided that the number of nonzeros of xo is of about the 
same order as the number of observations. As a second instance, suppose one observes 
few Fourier samples of xo, then stable recovery occurs for almost any set of n coefficients 
provided that the number of nonzeros is of the order of n/[logm] 6 . 

In the case where the error term vanishes, the recovery is of course exact, and this 
work actually provides novel insights on the exact recovery phenomenon discussed in 
earlier papers. The methodology also explains why one can also very nearly recover 
approximately sparse signals. 

Keywords. I\ -minimization, basis pursuit, restricted orthonormality, sparsity, singular 
values of random matrices. 
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1 Introduction 


1.1 Exact recovery of sparse signals 

Recent papers [2-5,10] have developed a series of powerful results about the exact recovery 
of a finite signal xo G M m from a very limited number of observations. As a representative 
result from this literature, consider the problem of recovering an unknown sparse signal 
xo(t ) G R m ; that is, a signal x’o whose support T$ = {t : xq( t) / 0} is assumed to have 
small cardinality. All we know about xq are n linear measurements of the form 

yk = {x 0l a k ) k = 1,. .. ,n or y = Ax 0 , 

where the a k G M m are known test signals. Of special interest is the vastly underdetermined 
case, n <C m, where there are many more unknowns than observations. At first glance, this 
may seem impossible. However, it turns out that one can actually recover xo exactly by 
solving the convex program 1 

(Pi) min ||x|| 4 subject to Ax = y, (1) 

provided that the matrix A G M nxm obeys a uniform uncertainty principle. 

The uniform uncertainty principle, introduced in [5] and refined in [4], essentially states that 
the n x m measurement matrix A obeys a “restricted isometry hypothesis.” To introduce 
this notion, let At, T C {1,..., m} be the n x |T| submatrix obtained by extracting the 
columns of A corresponding to the indices in T. Then [4] defines the P-restricted isometry 
constant 5s of A which is the smallest quantity such that 

(1 - 6 S ) ||c||| 2 < \\A T c\\j 2 < (1 + Ss) \\c\\j 2 (2) 

for all subsets T with |T| < S and coefficient sequences (cj)j^T- This property essentially 
requires that every set of columns with cardinality less than S approximately behaves like 
an orthonormal system. It was shown (also in [4]) that if S verifies 

+ $2S T ^3S < 1, (3) 

then solving (Pi) recovers any sparse signal xo with support size obeying |To| < S. 

1.2 Stable recovery from imperfect measurements 

This paper develops results for the “imperfect” (and far more realistic) scenarios where the 
measurements are noisy and the signal is not exactly sparse. Everyone would agree that 
in most practical situations, we cannot assume that Ax o is known with arbitrary precision. 
More appropriately, we will assume instead that one is given “noisy” data y = Ax o+e, where 
e is some unknown perturbation bounded by a known amount ||e||f 2 < e. To be broadly 
applicable, our recovery procedure must be stable: small changes in the observations should 
result in small changes in the recovery. This wish, however, may be quite hopeless. How can 
we possibly hope to recover our signal when not only the available information is severely 
incomplete but in addition, the few available observations are also inaccurate? 

1 (Pi) can even be recast as a linear program [6]. 
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Consider nevertheless (as in [12] for example) the convex program searching, among all 
signals consistent with the data y, for that with minimum ^i-norrn 

(P 2 ) min ||x||^ subject to \\Ax - y ||g 2 < e. (4) 

The first result of this paper shows that contrary to the belief expressed above, the solution 
to (P 2 ) recovers an unknown sparse object with an error at most proportional to the noise 
level. Our condition for stable recovery again involves the restricted isometry constants. 


Theorem 1 Let S be such that 63s + ^4s < 2. Then for any signal x$ supported on Tq 
with | To| < S and any perturbation e with ||e|| i 2 < e, the solution x$ to (P 2 ) obeys 

jM - xo\\e 2 < C s • e, (5) 

where the constant Cs may only depend on 64s- For reasonable values of 84s, Cs is well 
behaved; e.g. Cs ~ 8.82 for 84 s = 1/5 and Cs ~ 10.47 for 84 s = 1/4. 


It is interesting to note that for S obeying the condition of the theorem, the reconstruction 
from noiseless data is exact. It is quite possible that for some matrices A, this condition 
tolerates larger values of S than 0 . 

We would like to offer two comments. First, the matrix A is rectangular with many more 
columns than rows. As such, most of its singular values are zero. As emphasized earlier, 
the fact that the severely ill-posed matrix inversion keeps the perturbation from “blowing 
up” is rather remarkable and perhaps unexpected. 

Second, no recovery method can perform fundamentally better for arbitrary perturbations 
of size e. To see why this is true, suppose one had available an oracle letting us know, in 
advance, the support To of xq. With this additional information, the problem is well-posed 
and one could reconstruct xq by the method of Least-Squares for example, 


x = 


(A* To A To )-'A* To y 

0 


on T 0 
elsewhere. 


In the absence of any other information, one could easily argue that no method would 
exhibit a fundamentally better performance. Now of course, x — xo = 0 on the complement 
of To while on To 

x-x 0 = {At 0 A To ) -1 A^ 0 e, 

and since by hypothesis, the eigenvalues of A^ q At 0 are well-behaved 2 

l|®-®o ||/ 2 « \\A* To e\\t > 2 « e, 

at least for perturbations concentrated in the row space of At 0 ■ In short, obtaining a 
reconstruction with an error term whose size is guaranteed to be proportional to the noise 
level is the best one can hope for. 

Remarkably, not only can we recover sparse input vectors but one can also stably recover 
approximately sparse vectors, as we have the following companion theorem. 

2 Observe the role played by the singular values of At 0 in the analysis of the oracle error. 
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Theorem 2 Suppose that xq is an arbitrary vector in M m and let xq t s be the truncated 
vector corresponding to the S largest values of xq (in absolute value). Under the hypothesis 
of Theorem Q the solution x$ to (P 2 ) obeys 


W - X 0 \\e 2 < C 1} s ■ e + C‘ 


2 ,s 


Iko - ®0,slUi 

V5 


( 6 ) 


For reasonable values of 64 s the constants in © are well behaved; e.g. Ci t s ~ 12.04 and 
C\. S « 8.77 for 64 s = 1/5. 


Roughly speaking, the theorem says that minimizing i\ stably recovers the S'-largest entries 
of an m-dimensional unknown vector x from n measurements only. 

We now specialize this result to a commonly discussed model in mathematical signal pro¬ 
cessing, namely, the class of compressible signals. We say that xq is compressible if its 
entries obey a power law 

kol(fe) < Cr ■ k~\ (7) 

where |xo|(fc) is the kth largest value of xq (|xo|(i) > |xo|( 2 ) > ••• > |®o|(m))j r > U and 
C r is a constant which depends only on r. Such a model is appropriate for the wavelet 
coefficients of a piecewise smooth signal, for example. If xq obeys 0, then 

IN ~ ZQ.sllli ^ q / ^t—r+1/2 

Vs ~ r ' 

Observe now that in this case 

||so - *o,5lk <C':-S~ r+1 /\ 

and for generic elements obeying 0 , there are no fundamentally better estimates available. 
Hence, we see that with n measurements only, we achieve an approximation error which is 
almost as good as that one would obtain by knowing everything about the signal xo an d 
selecting its S'-largest entries. 

As a last remark, we would like to point out that in the noiseless case, Theorem [5] improves 
upon an earlier result from Candes and Tao, see also [8]; it is sharper in the sense that 1) 
this is a deterministic statement and there is no probability of failure, 2) it is universal in 
that it holds for all signals, 3) it gives upper estimates with better bounds and constants, 
and 4) it holds for a wider range of values of S. 


1.3 Examples 

It is of course of interest to know which matrices obey the uniform uncertainty principle 
with good isometry constants. Using tools from random matrix theory, [3,5,10] give several 
examples of matrices such that © holds for S on the order of n to within log factors. 
Examples include (proofs and additional discussion can be found in [5]): 

• Random matrices with i.i.d. entries. Suppose the entries of A are i.i.d. Gaussian with 
mean zero and variance 1/n, then [5,10,17] show that the condition for Theorem Q] 
holds with overwhelming probability when 

S < C ■ n/ log (m/n). 
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In fact, [4] gives numerical values for the constant C as a function of the ratio n/m. 
The same conclusion applies to binary matrices with independent entries taking values 
±1 /y/n with equal probability. 

• Fourier ensemble. Suppose now that A is obtained by selecting n rows from the 
m X m discrete Fourier transform and renormalizing the columns so that they are 
unit-normed. If the rows are selected at random, the condition for Theorem El holds 
with overwhelming probability for S < C ■ n/(logm) 6 [5]. (For simplicity, we have 
assumed that A takes on real-valued entries although our theory clearly accommodates 
complex-valued matrices so that our discussion holds for both complex and real-valued 
Fourier transforms.) 

This case is of special interest as reconstructing a digital signal or image from incom¬ 
plete Fourier data is an important inverse problem with applications in biomedical 
imaging (MRI and tomography), Astrophysics (interferometric imaging), and geo¬ 
physical exploration. 

• General orthogonal measurement ensembles. Suppose A is obtained by selecting n 
rows from an m by m orthonormal matrix U and renormalizing the columns so that 
they are unit-normed. Then [5] shows that if the rows are selected at random, the 
condition for Theorem ^ holds with overwhelming probability provided 

S<C'~- 7r -%f, (8) 

^ (logm) D 

where fi := y/rn maxj j \Uij\. Observe that for the Fourier matrix, fi = 1, and thus 
© is an extension of the Fourier ensemble. 

This fact is of significant practical relevance because in many situations, signals of 
interest may not be sparse in the time domain but rather may be (approximately) 
decomposed as a sparse superposition of waveforms in a fixed orthonormal basis T; 
e.g. in a nice wavelet basis. Suppose that we use as test signals a set of n vectors taken 
from a second orthonormal basis 4>. We then solve (Pi) in the coefficient domain 

(P[) min ||asubject to Aa = y, 

where A is obtained by extracting n rows from the orthonormal matrix U = TT*. The 
recovery condition then depends on the mutual coherence y, between the measurement 
basis <f> and the sparsity basis 'F which measures the similarity between <F and T; 
/i(4>,T) = yfm max \(<j) k ,ip j )\, 4 > k <E 4>, ipj <E T. 

1.4 Prior work and innovations 

The problem of recovering a sparse vector by minimizing t\ under linear equality constraints 
has recently received much attention, mostly in the context of Basis Pursuit , where the goal 
is to uncover sparse signal decompositions in overcomplete dictionaries. We refer the reader 
to [11,13] and the references therein for a full discussion. 

We would especially like to note two works by Donoho, Elad, and Temlyakov [12], and Tropp 
[18] that also study the recovery of sparse signals from noisy observations by solving (P 2 ) 
(and other closely related optimization programs), and give conditions for stable recovery. 
In [12], the sparsity constraint on the underlying signal xq depends on the magnitude of 
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the maximum entry of the Gram matrix M(A ) = maxjj : j-^j \(A*A)\ij. Stable recovery 
occurs when the number of nonzeros is at most (M^ 1 + l)/4. For instance, when A is a 
Fourier ensemble and n is on the order of m, we will have M at least of the order 1 /y/n 
(with high probability), meaning that stable recovery is known to occur when the number 
of nonzeros is about at most 0(^/n). In contrast, the condition for Theorem will hold 
when this number is about n/(logm) 6 , due to the range of support sizes for which the 
uniform uncertainty principle holds. In [18], a more general condition for stable recovery 
is derived. For the measurement ensembles listed in the previous section, however, the 
sparsity required is still on the order of yjn in the situation where n is comparable to m. In 
other words, whereas these results require at least 0 (y/m) observations per unknown, our 
results show that—ignoring log-like factors—only 0(1) are, in general, sufficient. 

More closely related is the very recent work of Donoho [9] who shows a version of © in 
the case where A e R nxm i s a Gaussian matrix with n proportional to m, with unspecified 
constants for both the support size and that appearing in ©• Our main claim is on a 
very different level since it is (1) deterministic (it can of course be specialized to random 
matrices), and (2) widely applicable since it extends to any matrix obeying the condition 
83 s T 3^4S < 2. In addition, the argument underlying Theorem|T]is short and simple, giving 
precise and sharper numerical values. Finally, we would like to point out connections with 
fascinating ongoing work which develops fast randomized algorithms for sparse Fourier 
transforms [14, 19]. Suppose xq is a fixed vector with |To| nonzero terms, for example. 
Then [14] shows that it is possible to randomly sample the frequency domain | To | poly (log m) 
times (poly(log m) denotes a polynomial term in logm), and reconstruct xq from these 
frequency data with positive probability. We do not know whether these algorithms are 
stable in the sense described in this paper, and whether they can be modified to be universal, 
i.e. reconstruct all signals of small support. 

2 Proofs 

2.1 Proof of Theorem m sparse case 

The proof of the theorem makes use of two geometrical special facts about the solution x N 
to (P 2 ). 

1. Tube constraint. First, Ax$ is within 2e of the “noise free” observations Axo thanks 
to the triangle inequality 

II Ax s - Ax 0 \\i 2 < \\Ax i - y\\e 2 + \\Ax 0 - y\\e 2 < 2e. (9) 

Geometrically, this says that is known to be in a cylinder around the n-dimensional 
plane Ax o- 

2. Cone constraint. Since xo is feasible, we must have < ||xolk- Decompose x^ 

as x^ = .To + h. As observed in [13] 

ll*olk - II^Tolk + IK c lk - H x o + h \Ui - Ikolk, 

where To is the support of xo, and hr 0 (t) = h(t) for t G To and zero elsewhere 
(similarly for h^)- Hence, h obeys the cone constraint 

\\hT§\\h < \\hToWh (10) 
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Figure 1: Geometry in M 2 . Here , the point Xq is a vertex of the f\ ball and the shaded area 
represents the set of points obeying both the tube and the cone constraints. By showing that every 
vector in the cone of descent at xq is approximately orthogonal to the nullspace of A, we will ensure 
that x$ is not too far from xq- 


which expresses the geometric idea that h must lie in the cone of descent of the 
^i-norm at xq. 


Figure ^ illustrates both these geometrical constraints. Stability follows from the fact that 
the intersection between 0 (||A/i||f 2 < 2e) and (ITUll is a set with small radius. The reason 
why this holds is because every vector h in the £i-cone CH is approximately orthogonal 
to the nullspace of A. We shall prove that ||^4 /i||^ 2 ~ \\h\\t 2 and together with this 
establishes the theorem. 

We begin by dividing Tq into subsets of size M (we will choose M later) and enumerate Tq 
as ni, 7i2, • • •, n/v-jTo| i n decreasing order of magnitude of hp§- Set Tj = {ri£, (j — l)M +1 < 
i < jM}. That is, T± contains the indices of the M largest coefficients of hpg, T‘i contains 
the indices of the next M largest coefficients, and so on. 

With this decomposition, the t^-norm of h is concentrated on Toi = To U T\. Indeed, the 
tli largest value of hpg obeys 

\hT§\(k) < \\hT§\\eJk 

and, therefore, 

m 

\\hT Sl \\l < \\h TS \\i E l /b 2 ^ \\ h T§\\l/ M - 

k=M +1 

Further, the £i-cone constraint gives 

ll^illi < WhToWjjM < \\h To \\l-\T 0 \/M 
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and thus 


2 

fe 


( 11 ) 


= IIKillfe + IIK 


112 

life 


< (l + \T 0 \/M)-\\h Toi \\l. 


Observe now that 


\\ Ah \\h = \\ a t 01 hr bi + ^2 A Tj h T j \\i 2 > \\A Toi h Toi \\e 2 ~ 1| ^ A Tj h Tj \\e 2 
j >2 j >2 

> \\A Toi h Toi |U 2 “ X! II A’Tjhi'j ||f 2 

j> 2 

> a/ 1 ~ $M+\Tq\ HKi life ~ a/1 + ^M ^ 11 ^Tj 11 fe • 

j> 2 

Set p = \Tq\/M. As we shall see later, 

ElIM* - vK UK life (12) 

3> 2 


which gives 

||A/l||^ 2 > C'lTol.M • IKi life; C\t 0 \,M '■= \J 1 “ <W+|T 0 | “ K V ^ (13) 

It then follows from CEB and ||A/?,||g 2 < 2 e that 

llftk < A+7 ■ i|kT„,|]& < ■ i|Afc|| (2 < yiH . e , (i4) 

fe|T 0 |,M L \T 0 \,M 

provided that the denominator is of course positive. 

We may specialize d and take M = 3|7o|. The denominator is positive if < 53 |t 0 |+ 3 < 54 |t 0 | <2 
(this is true if <5 4 | To < 1/2, say) which proves the theorem. Note that if ^45 is a little smaller, 
the constant in |Sj) is not large. For 64 s < 1/5, Cs ~ 8.82, while for ^45 < 1/4, Cs ~ 10.47 
as claimed. 

It remains to argue about m- Observe that by construction, the magnitude of each 
coefficient in T ) + 1 is less than the average of the magnitudes in Tj: 

\hr j+1 (i)l < \\h Tj hJM. 

Then 

IIK+i life < \\h Tj \^tjM 

and m follows from 

EIIKlife < ElIKIIfe/v^ < II^Tollfe/VM^KW^-IIKIIfe- 

j>2 j>1 


2.2 Proof of Theorem [2} general case 

Suppose now that xo is arbitrary. We let To be the indices of the largest |To| coefficients 
of xo (the value |To| will be decided later) and just as before, we divide up Tq into sets 
of equal size |X) = M, j > 1, by decreasing order of magnitude. The cone 
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constraint m may not hold but a variation does. Indeed, x = xq + h is feasible and, 
therefore, 


11 Zo, To Ik - WhToWh - lko,T 0 c |ki + IIM* - ll X °- T o + ^Tolki + ||® 0 ,T 0 C + h T § |k < H^olUi , 


which gives 


IIM* < WhToWh + 2\\x 0 ,T§Wh- 


(15) 


The rest of the argument now proceeds essentially as before. First, h is in the some sense 
concentrated on Tqi = Tq U T\ since with the same notations 


I \hrt 


01 1142 — 


WhTpWh + 2 W x 0,T§\\h 

Vm 


< VP' \ WhToWh + 


2 lko,T 0 c lki 

V\n\ 


which in turn implies 

IHk < (! + Vp)W h T 0 iWh + 2 \fp- r i > v ■= W x o,T$Wh/VWo\- 


(16) 


Better estimates via Pythagoras’ formula are of course possible (see DUO but we ignore 
such refinements in order to keep the argument as simple as possible. Second, the same 
reasoning as before gives 


E IMIfe < < Vp- (IIM& + 2.,) 


and thus 

ll-^lk ^ C|T 0 |,M ' IkToi Ik ~ 2 VP 'J 1 + &M ' V, 
where C\t 0 lm is the same as in (11.'ll) . Since ||^4/i|| < 2e, we again conclude that 

2 _ 

||%oi IU 2 ^ 7 ?-‘ ( e + yfpyjl + $M v)i 

°|T 0 |,M 

(note that the constant in front of the e factor is the same as in the truly sparse case) and 
the claim © follows from uni. Specializing the bound to M = 3|7o| and assuming that 
5s < 1/5 gives the numerical values reported in the statement of the theorem. 


3 Numerical Examples 

This section illustrates the effectiveness of the recovery by means of a few simple numerical 
experiments. Our simulations demonstrate that in practice, the constants in m and © 
seem to be quite low. 

Our first series of experiments is summarized in Tablesnand[21 In each experiment, a length 
1024 signal was measured with the same 300 x 1024 Gaussian measurement ensemble. The 
measurements were then corrupted by additive white Gaussian noise: y & = (xo, a*,) + e*, 
with ejt ~ A7(0, a 2 ) for various noise levels a. The squared norm of the error \\e llf 2 is a 
chi-square random variable with mean cr 2 n and standard deviation cr 2 \/2n; owing to well 
known concentration inequalities, the probability that ||e|| 2 2 exceeds its mean plus two or 
three standard deviations is small. We then solve (P 2 ) with 

e 2 = cr 2 (n + A\/2n) (17) 
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and select A = 2 although other choices are of course possible. 

Table 0 charts the results for sparse signals with 50 nonzero components. Ten signals were 
generated by choosing 50 indices uniformly at random, and then selecting either —1 or 1 at 
each location with equal probability. An example of such a signal is shown in Figure Ufa). 
Previous experiments [3] have demonstrated that we were empirically able to recover such 
signals perfectly from 300 noiseless Gaussian measurements, which is indeed the case for 
each of the 10 signals considered here. The average value of the recovery error (taken over 
the 10 signals) is recorded in the bottom row of Table 0 I n this situation, the constant in 
© appears to be less than 2. 

Table 0 charts the results for 10 compressible signals whose components are all non-zero, 
but decay as in 0 . The signals were generated by taking a fixed sequence 

Xsort(t) = (5.819) -f 10 / 0 , (18) 

randomly permuting it, and multiplying by a random sign sequence (the constant in (1181) 
was chosen so that the norm of the compressible signals is the same — \/50 — as the 
sparse signals in the previous set of experiments). An example of such a signal is shown in 
Figure 0c). Again, 10 such signals were generated, and the average recovery error recorded 
in the bottom row of Table 0 

For small values of a, the recovery error is dominated by the approximation error — the 
second term on the right hand side of ©• As a reference, the 50 term nonlinear approxi¬ 
mation errors of these compressible signals is around 0.47; at low signal-to-noise ratios our 
recovery error is about 1.5 times this quantity. As the noise power gets large, the recovery 
error becomes less than e, just as in the sparse case. 

Finally, we apply our recovery procedure to realistic imagery. Photograph-like images, 
such as the 256 x 256 pixel Boats image shown in Figure 0a), have wavelet coefficient 
sequences that are compressible (see [7]). The image is a 65536 dimensional vector, making 
the standard Gaussian ensemble too unwieldy 3 . Instead, we make 25000 measurements 
of the image using a scrambled real Fourier ensemble; that is, the test functions ak{t) are 
real-valued sines and cosines (with randomly selected frequencies) which are temporally 
scrambled by randomly permuting the m time points. In other words, this ensemble is 
obtained from the (real-valued) Fourier ensemble by a random permutation of the columns. 
For our purposes here, the test functions behave like a Gaussian ensemble in the sense that 
from n measurements, one can recover signals with about n/5 nonzero components exactly 
from noiseless data. There is a computational advantage as well, since we can apply A and 
its adjoint A T to an arbitrary vector by means of an m point FFT. To recover the wavelet 
coefficients of the object, we simply solve 

(P 2 ) min HaH^ subject to ||AVF*a — y\\i 2 < e, 

where A is the scrambled Fourier ensemble, and W is the discrete Daubechies-8 orthogonal 
wavelet transform. 

We will attempt to recover the image from measurements perturbed in two different man¬ 
ners. First, as in the ID experiments, the measurements were corrupted by additive white 
Gaussian noise with o = 5 TO” 4 so that a-\fn = .0791. As shown in FigureHJ the noise level 

3 Storing a double precision 25000 x 65536 matrix would use around 13.1 gigabytes of memory, about the 

capacity of three standard DVDs. 
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is significant; the signal-to-noise ratio is ||Acoll^/IMI^ = 4.5. With e = .0798 as in (1171) . 
the recovery error is \\c$ — c^o|U 2 = 0.1303 (the original image has unit norm). For compari¬ 
son, the 5000 term nonlinear approximation error for the image is ||ao,5000 — «o ||£ 2 = 0-050. 
Hence the recovery error is very close to the sum of the approximation error and the size 
of the perturbation. 

Another type of perturbation of practical interest is round-off or quantization error. In 
general, the measurements cannot be taken with arbitrary precision, either because of limi¬ 
tations inherent to the measuring device, or that we wish to communicate them using some 
small number of bits. Unlike additive white Gaussian noise, round-off error is deterministic 
and signal dependent—a situation our methodology deals with easily. 

The round-off error experiment was conducted as follows. Using the same scrambled 
Fourier ensemble, we take 25000 measurements of Boats , and round (quantize) them to 
one digit (we restrict the values of the measurements to be one of ten preset values, equally 
spaced). The measurement error is shown in Figure 0(c), and the signal-to-noise ratio is 
||Axo||£ 2 /||e ||^ 2 = 4.3. To choose e, we use a rough model for the size of the perturbation. 
To a first approximation, the round-off error for each measurement behaves like a uniformly 
distributed random variable on (—q/2, q/2), where q is the distance between quantization 
levels. Under this assumption, the size of the perturbation ||e ||| 2 would behave like a sum 
of squares of uniform random variables 

n 

Y = X k~ Uniform (-|, |) . 

k=\ 

Here, mean(Y) = nq 2 /12 and std(Y) = \/nq 2 /[ 6 Vb]. Again, Y is no larger than mean(Y) + 
Astd(Y) with high probability, and we select 

e 2 = nq 2 / 12 + X\/nq 2 /[6a/5], 


where as before, A = 2. The results are summarized in the second column of Table 01 As 
in the previous case, the recovery error is very close to the sum of the approximation and 
measurement errors. Also note that despite the crude nature of the perturbation model, 
an accurate value of e is chosen. 


Although the errors in the recovery experiments summarized in the third row of Table 01 
are as we hoped, the recovered images tend to contain visually displeasing high frequency 
oscillatory artifacts. To address this problem, we can solve a slightly different optimization 
problem to recover the image from the same corrupted measurements. In place of (Pf), we 
solve 

(TV) min ||x||tv subject to ||Ax — y\\e 2 < e 


where 

\\x\\tv = ''f ^ yj (^i+ij ®ij) 2 T (*U,j+1 ^Uj) 2 = y ' |(Vx)jj| 

i,j ij 

is the total variation [16] of the image x: the sum of the magnitudes of the (discretized) gra¬ 
dient. By substituting (TV) for (Pf)-, we are essentially changing our model for photograph¬ 
like images. Instead of looking for an image with a sparse wavelet transform that explains 
the observations, program (TV) searches for an image with a sparse gradient (i.e. without 
spurious high frequency oscillations). In fact, it is shown in [3] that just as signals which are 
exactly sparse can be recovered perfectly from a small number of measurements by solving 
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Table 1: Recovery results for sparse ID signals. Gaussian white noise of variance o 2 was added to 
each of the n = 300 measurements, and (P 2 ) was solved with e chosen such that ||e||2 < e with high 
probability (see (TT7I ) ). 


C7 

0.01 

0.02 

0.05 

0.1 

0.2 

0.5 

e 

0.19 

0.37 

0.93 

1.87 

3.74 

9.34 

x* — Xo||2 

0.25 

0.49 

1.33 

2.55 

4.67 

6.61 


Table 2: Recovery results for compressible ID signals. Gaussian white noise of variance a 2 was 
added to each measurement, and (P 2 ) was solved with e as in (TT7I) . 


a 

0.01 

0.02 

0.05 

0.1 

0.2 

0.5 

e 

0.19 

0.37 

0.93 

1.87 

3.74 

9.34 

H 

1 

O 

to 

0.69 

0.76 

1.03 

1.36 

2.03 

3.20 


(P2) with e = 0, signals with gradients which are exactly sparse can be recovered by solving 
(TV) (again with e = 0). 

Figure Hb) and (c) and the fourth row of Table 01 show the (TV) recovery results. The 
reconstructions have smaller error and do not contain visually displeasing artifacts. 


4 Discussion 

The convex programs (P 2 ) and (TV) are simple instances of a class of problems known as 
second-order cone programs (SOCP’s). As an example, one can recast (TV) as 

min E Ui,j subject to - Uij < \\GijxWh < u itj , || Ax - b \\ l2 < e, (19) 
hi 

where Gijx = (xi+ij — Xjy, x i,j+\ — x i,j) [15]. SOCP’s can nowadays be solved efficiently 
by interior-point methods [1] and, hence, our approach is computationally tractable. 

From a certain viewpoint, recovering via (P 2 ) is using a priori information about the nature 
of the underlying image, i.e. that it is sparse in some known orthobasis, to overcome 
the shortage of data. In practice, we could of course use far more sophisticated models 
to perform the recovery. Obvious extensions include looking for signals that are sparse 


Table 3: Image recovery results. Measurements of the Boats image were corrupted in two different 
ways: by adding white noise (left column) with er = 5 • 1CF 4 and by rounding off to one digit (right 
column). In each case, the image was recovered in two different ways: by solving (P)) (third row) 
and solving (TV) (fourth row). The (TV) images are shown in Figure[3| 



White noise 

Round-off 

l|e|k 

0.0789 

0.0824 

e 

0.0798 

0.0827 

\\ofi ~ a 0 \\h 

0.1303 

0.1323 

II^TV — a 0 life 

0.0837 

0.0843 
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(b) 



(d) 


Figure 2: (a) Example of a sparse signal used in the ID experiments. There are 50 non-zero 
coefficients taking values ±1. (b) Sparse signal recovered from noisy measurements with o = 0.05. 
(c) Example of a compressible signal used in the ID experiments, (d) Compressible signal recovered 
from noisy measurements with o = 0.05. 



(a) (b) (c) 


Figure 3: (a) Original 256 x 256 Boats image, (b) Recovery via {TV) from n = 25000 measurements 
corrupted with Gaussian noise, (c) Recovery via {TV) from n = 25000 measurements corrupted 
by round-off error. In both cases, the reconstruction error is less than the sum of the nonlinear 
approximation and measurement errors. 
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Figure 4: (a) Noiseless measurements Ax o of the Boats image, (h) Gaussian measurement error 
with cr = 5-10 -4 in the recovery experiment summarized in the left column of Tabled The signal - 
to-noise ratio is || Aro||^ 2 /||e||^ 2 = 4.5. (c) Round-off error in the recovery experiment summarized 
in the right column of Tabled The signal-to-noise ratio is || Axoll^/INIfe = 4.3. 


in overcomplete wavelet or curvelet bases, or for images that have certain geometrical 
structure. The numerical experiments in Section 0 show how changing the model can result 
in a higher quality recovery from the same set of measurements. 
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